Toxicology


Combining Deep Learning and Explainable AI for Toxicity Prediction of Chemical Compounds

Popescu, Eduard, Groza, Adrian, Cernat, Andreea

arXiv.org Artificial Intelligence

The task here is to predict the toxicological activity of chemical compounds based on the Tox21 dataset, a benchmark in computational toxicology. After a domain-specific overview of chemical toxicity, we discuss current computational strategies, focusing on machine learning and deep learning. Several architectures are compared in terms of performance, robustness, and interpretability. This research introduces a novel image-based pipeline based on DenseNet121, which processes 2D graphical representations of chemical structures. Additionally, we employ Grad-CAM visualizations, an explainable AI technique, to interpret the model's predictions and highlight molecular regions contributing to toxicity classification. The proposed architecture achieves competitive results compared to traditional models, demonstrating the potential of deep convolutional networks in cheminformatics. Our findings emphasize the value of combining image-based representations with explainable AI methods to improve both predictive accuracy and model transparency in toxicology.
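To make the Grad-CAM step concrete, here is a minimal NumPy sketch of the core computation: channel weights are obtained by global-average-pooling the gradients of the class score, and the heatmap is a ReLU'd weighted sum of the activation maps. This is an illustration of the standard Grad-CAM formula only, not the authors' DenseNet121 pipeline; in practice `activations` and `gradients` come from a forward/backward pass through the network's last convolutional layer.

```python
import numpy as np

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Compute a Grad-CAM heatmap from a conv layer's activations and the
    gradients of the predicted class score w.r.t. those activations.

    activations, gradients: arrays of shape (channels, height, width).
    Returns a (height, width) map normalized to [0, 1].
    """
    # Channel weights: global-average-pool the gradients
    weights = gradients.mean(axis=(1, 2))                     # (channels,)
    # Weighted sum of activation maps, then ReLU keeps positive evidence
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)
    if cam.max() > 0:
        cam = cam / cam.max()                                 # normalize for display
    return cam

# Toy example: 2 channels on a 2x2 spatial map
acts = np.array([[[1.0, 0.0], [0.0, 0.0]],
                 [[0.0, 2.0], [0.0, 0.0]]])
grads = np.array([[[1.0, 1.0], [1.0, 1.0]],
                  [[0.5, 0.5], [0.5, 0.5]]])
heatmap = grad_cam(acts, grads)
```

Upsampled to the input resolution, such a heatmap is what highlights the molecular regions that drove a toxicity prediction.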


Explainable Molecular Property Prediction: Aligning Chemical Concepts with Predictions via Language Models

Wang, Zhenzhong, Lin, Zehui, Lin, Wanyu, Yang, Ming, Zeng, Minggang, Tan, Kay Chen

arXiv.org Artificial Intelligence

Providing explainable molecular property predictions is critical for many scientific domains, such as drug discovery and materials science. Though transformer-based language models have shown great potential in accurate molecular property prediction, they neither provide chemically meaningful explanations nor faithfully reveal molecular structure-property relationships. In this work, we develop a new framework for explainable molecular property prediction based on language models, dubbed Lamole, which can provide explanations aligned with chemical concepts. We first leverage a designated molecular representation -- Group SELFIES -- as it provides chemically meaningful semantics. Because attention mechanisms in Transformers inherently capture relationships within the input, we further combine the attention weights and gradients to generate explanations that capture functional group interactions. We then carefully craft a marginal loss to explicitly optimize the explanations to align with chemists' annotations. We bridge the manifold hypothesis with the elaborated marginal loss to prove that the loss aligns the explanations with the tangent space of the data manifold, leading to concept-aligned explanations. Experimental results on six mutagenicity datasets and one hepatotoxicity dataset demonstrate that Lamole achieves comparable classification accuracy and boosts explanation accuracy by up to 14.8%, making it the state of the art in explainable molecular property prediction.
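The attention-times-gradient ingredient mentioned above can be sketched in a few lines. The example below is a hypothetical single-head illustration (the names and the normalization are ours, and Lamole's full explanation objective with the marginal loss is considerably more elaborate): positive elementwise products of attention weights and their gradients are summed per token to score its relevance.

```python
import numpy as np

def token_attributions(attn: np.ndarray, attn_grad: np.ndarray) -> np.ndarray:
    """Combine attention weights with their gradients to score input tokens,
    in the spirit of gradient-weighted attention attribution.

    attn, attn_grad: (tokens, tokens) arrays from one attention head.
    Returns a (tokens,) relevance score, ReLU'd and normalized to sum to 1.
    """
    relevance = np.maximum(attn * attn_grad, 0.0)  # keep positive contributions
    scores = relevance.sum(axis=0)                 # total weighted attention received
    total = scores.sum()
    return scores / total if total > 0 else scores

# Toy 2-token example: gradients amplify attention onto the second token
attn = np.array([[0.7, 0.3],
                 [0.4, 0.6]])
grad = np.array([[1.0, 0.0],
                 [0.0, 2.0]])
scores = token_attributions(attn, grad)
```

With Group SELFIES as input, each "token" corresponds to a chemically meaningful group, so such scores can be read as functional-group relevance.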


Explainable machine learning for predicting shellfish toxicity in the Adriatic Sea using long-term monitoring data of HABs

Marzidovšek, Martin, Francé, Janja, Podpečan, Vid, Vadnjal, Stanka, Dolenc, Jožica, Mozetič, Patricija

arXiv.org Artificial Intelligence

In this study, explainable machine learning techniques are applied to predict the toxicity of mussels in the Gulf of Trieste (Adriatic Sea) caused by harmful algal blooms. By analysing a newly created 28-year dataset containing records of toxic phytoplankton in mussel farming areas and toxin concentrations in mussels (Mytilus galloprovincialis), we train and evaluate the performance of ML models to accurately predict diarrhetic shellfish poisoning (DSP) events. The random forest model provided the best prediction of positive toxicity results based on the F1 score. Explainability methods such as permutation importance and SHAP identified key species (Dinophysis fortii and D. caudata) and environmental factors (salinity, river discharge and precipitation) as the best predictors of DSP outbreaks. These findings are important for improving early warning systems and supporting sustainable aquaculture practices.
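Permutation importance, one of the explainability methods named above, is simple enough to sketch directly: shuffle one feature's column, re-score the model, and report the mean drop in accuracy. The toy model and data below are illustrative stand-ins (a threshold rule instead of the study's fitted random forest), not the paper's setup.

```python
import numpy as np

def permutation_importance(predict, X, y, feature, n_repeats=30, rng=None):
    """Permutation importance of one feature: mean drop in accuracy when
    that feature's column is shuffled, breaking its link to the target.
    `predict` is any fitted model's prediction function.
    """
    rng = rng or np.random.default_rng(0)
    baseline = np.mean(predict(X) == y)
    drops = []
    for _ in range(n_repeats):
        Xp = X.copy()
        rng.shuffle(Xp[:, feature])          # permute one column in place
        drops.append(baseline - np.mean(predict(Xp) == y))
    return float(np.mean(drops))

# Toy data: the label depends only on feature 0 (think: abundance of a
# key Dinophysis species), while feature 1 is irrelevant noise.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(int)
model = lambda X: (X[:, 0] > 0).astype(int)   # stand-in for a fitted forest

imp0 = permutation_importance(model, X, y, feature=0)
imp1 = permutation_importance(model, X, y, feature=1)
```

Here `imp0` comes out large and `imp1` near zero, mirroring how the method separated predictive species and environmental drivers from uninformative inputs.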


Hidden Flaws Behind Expert-Level Accuracy of GPT-4 Vision in Medicine

Jin, Qiao, Chen, Fangyuan, Zhou, Yiliang, Xu, Ziyang, Cheung, Justin M., Chen, Robert, Summers, Ronald M., Rousseau, Justin F., Ni, Peiyun, Landsman, Marc J, Baxter, Sally L., Al'Aref, Subhi J., Li, Yijia, Chiang, Michael F., Peng, Yifan, Lu, Zhiyong

arXiv.org Artificial Intelligence

Recent studies indicate that Generative Pre-trained Transformer 4 with Vision (GPT-4V) outperforms human physicians in medical challenge tasks. However, these evaluations primarily focused on the accuracy of multiple-choice questions alone. Our study extends the current scope by conducting a comprehensive analysis of GPT-4V's rationales for image comprehension, recall of medical knowledge, and step-by-step multimodal reasoning when solving New England Journal of Medicine (NEJM) Image Challenges - an imaging quiz designed to test the knowledge and diagnostic capabilities of medical professionals. Evaluation results confirmed that GPT-4V outperforms human physicians in multiple-choice accuracy (88.0% vs. 77.0%, p=0.034). GPT-4V also performs well on cases that physicians answer incorrectly, with over 80% accuracy. However, we discovered that GPT-4V frequently presents flawed rationales even when it makes the correct final choice (27.3% of cases), most prominently in image comprehension (21.6%). Despite GPT-4V's high multiple-choice accuracy, our findings emphasize the necessity for further in-depth evaluations of its rationales before integrating such models into clinical workflows.


Predictive toxicology evolving from in vivo to in vitro to in silico systems

#artificialintelligence

A team of researchers at the Laboratory for Health Protection of the National Institute of Public Health and the Environment in Bilthoven, the Netherlands, in collaboration with the German Centre for the Protection of Laboratory Animals (Bf3R) at the German Federal Institute for Risk Assessment (BfR) in Berlin, Germany, and the Utrecht Institute of Pharmaceutical Sciences of Utrecht University, Utrecht, the Netherlands, emphasizes the need for microphysiological systems to support innovations in organoid and organ-on-chip microfluidic devices (Schneider et al., 2021). According to the investigators, rigorous evaluation of the potentially toxic effects of chemicals, including pharmaceutical compounds, on human and environmental health remains difficult. The complexity of biological processes and the limited accessibility of in vivo experiments exacerbate this problem. Over the past few years, a growing number of researchers have therefore turned to model systems ranging from single cell lines to complex animal models. In the past five years, microphysiological systems that mimic human physiology on a small scale have gained considerable attention.


A Roadmap to Asymptotic Properties with Applications to COVID-19 Data

Cui, Elvis Han

arXiv.org Artificial Intelligence

A good estimator should, at least in the asymptotic sense, be close to the true quantity it aims to estimate, and we should be able to give an uncertainty measure based on a finite sample size. An estimator with well-behaved asymptotic properties can help clinicians in many ways, such as reducing the number of patients needed in a trial, cutting the budget for toxicology studies, and providing insightful findings for late-phase trials. Following Sir Ronald Fisher [1], generations of statisticians have worked on the so-called "consistency" and "asymptotic normality" of estimators. The former is based on different versions of the law of large numbers (LLN) and the latter on various types of central limit theorems (CLT) [2]. In addition to these two main tools, statisticians also apply other important but less well-known results from probability theory and other mathematical fields. To name a few: extreme value theory for distributions of maxima and minima [3], convex analysis for checking the optimality of a statistical design [4], asymptotic relative efficiency (ARE) of an estimator [5], concentration inequalities for finite-sample properties and selection consistency [6], and other non-normal limits, robustness, and simultaneous confidence bands of common statistical estimators [7, 8]. Despite this variety of properties, consistency and asymptotic normality remain the most celebrated and important properties of statistical estimators in both academia and industry. Hence, in the following, we present a roadmap to consistency and asymptotic normality. We then illustrate their applications in toxicology studies and clinical trials using a COVID-19 dataset.
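The CLT behind asymptotic normality can be checked empirically in a few lines: for i.i.d. draws from a skewed population (here exponential with mean and standard deviation 1), the standardized sample mean sqrt(n)(x̄ - μ)/σ should behave like a standard normal for large n. This simulation is our illustration of the general principle, not an analysis from the paper.

```python
import numpy as np

# Empirical CLT check: standardized sample means of an exponential(1)
# population should be approximately N(0, 1) for large n.
rng = np.random.default_rng(0)
mu, sigma, n, reps = 1.0, 1.0, 200, 10_000

samples = rng.exponential(scale=1.0, size=(reps, n))
z = np.sqrt(n) * (samples.mean(axis=1) - mu) / sigma

# The first two moments of z should be close to 0 and 1 respectively,
# even though the underlying population is heavily skewed.
mean_z, std_z = z.mean(), z.std()
```

A well-behaved standardized estimator like this is exactly what licenses the normal-approximation confidence intervals used in sample-size and toxicology-study planning.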


Diagnosis of Acute Poisoning Using Explainable Artificial Intelligence

Chary, Michael, Boyer, Ed W, Burns, Michele M

arXiv.org Artificial Intelligence

Medical toxicology is the clinical specialty that treats the toxic effects of substances, be it an overdose, a medication error, or a scorpion sting. The volume of toxicological knowledge and research has, as with other medical specialties, outstripped the ability of the individual clinician to entirely master and stay current with it. The application of machine learning techniques to medical toxicology is challenging because initial treatment decisions are often based on a few pieces of textual data and rely heavily on prior knowledge. ML techniques often do not represent knowledge in a way that is transparent to the physician, raising barriers to usability. Rule-based systems and decision tree learning are more transparent approaches, but often generalize poorly and require expert curation to implement and maintain. Here, we construct a probabilistic logic network to represent a portion of the knowledge base of a medical toxicologist. Our approach transparently mimics the knowledge representation and clinical decision-making of practicing clinicians. The software, dubbed Tak, performs comparably to humans on straightforward and intermediate-difficulty cases, but is outperformed by humans on challenging clinical cases. Tak outperforms a decision tree classifier at all levels of difficulty. Probabilistic logic provides one form of explainable artificial intelligence that may be more acceptable for use in healthcare, if it can achieve acceptable levels of performance.
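To illustrate the general flavor of probabilistic reasoning over a transparent rule table (this is a naive-Bayes-style toy, not Tak's probabilistic logic network, and all toxidromes, findings, and probability values below are invented for illustration): each candidate toxidrome is scored by its prior plus the log-likelihoods of the observed findings, and every number contributing to the answer can be inspected.

```python
import math

# Hypothetical rule base: P(finding | toxidrome) and priors.
# All values are illustrative only, NOT clinical data.
PRIORS = {"anticholinergic": 0.5, "cholinergic": 0.5}
LIKELIHOODS = {
    "anticholinergic": {"dry_skin": 0.9, "mydriasis": 0.8, "salivation": 0.05},
    "cholinergic":     {"dry_skin": 0.05, "mydriasis": 0.1, "salivation": 0.9},
}

def diagnose(findings):
    """Score each toxidrome with log-prior plus summed log-likelihoods of the
    observed findings, then normalize into a posterior over toxidromes.
    Every rule and weight used in the decision remains human-readable."""
    log_scores = {
        d: math.log(PRIORS[d]) + sum(math.log(LIKELIHOODS[d][f]) for f in findings)
        for d in PRIORS
    }
    m = max(log_scores.values())                 # stabilize the exponentials
    expd = {d: math.exp(s - m) for d, s in log_scores.items()}
    z = sum(expd.values())
    return {d: v / z for d, v in expd.items()}

posterior = diagnose(["dry_skin", "mydriasis"])
```

Because the "reasoning" is just a table of conditional probabilities, a clinician can audit exactly which findings shifted the posterior, which is the transparency property the abstract argues for.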


Expert system in clinical toxicology

AITopics Original Links

Compared with other medical fields, clinical toxicology is probably easier to formalise, because few heuristics are used and much of the data can be managed in a database, such as information about drugs and toxicological classes. From the beginning of the analysis, we intentionally separated data and knowledge. In SETH, data on drugs, toxicological classes and advice can be updated within the database application. Data on drugs and toxicological classes are maintained with an electronic dictionary of French drugs available in our University Hospital and are updated every three months (20). Only updates to the reasoning and to interactions between toxicological classes need to be made in the knowledge base.
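The data/knowledge separation described above can be sketched schematically: the updatable tables (drugs, classes, thresholds) live apart from the reasoning that consumes them, so refreshing the drug dictionary never touches the rules. The drug names, fields, and threshold values below are invented for illustration and bear no relation to SETH's actual contents.

```python
# "Data": tables maintained in the database application and refreshed
# independently of the rules. Values are illustrative only.
DRUG_DATA = {
    "amitriptyline": {"class": "tricyclic", "toxic_dose_mg_per_kg": 10},
    "diazepam":      {"class": "benzodiazepine", "toxic_dose_mg_per_kg": 30},
}

def advice(drug: str, dose_mg: float, weight_kg: float) -> str:
    """'Knowledge': reasoning applied to whatever the data tables currently
    hold. Updating DRUG_DATA never requires editing this function."""
    entry = DRUG_DATA[drug]
    dose_per_kg = dose_mg / weight_kg
    if dose_per_kg >= entry["toxic_dose_mg_per_kg"]:
        return f"potentially toxic {entry['class']} ingestion: refer to hospital"
    return "dose below toxic threshold: observe"

msg = advice("amitriptyline", dose_mg=900, weight_kg=70)
```

Keeping the threshold data out of the rule body is what lets the quarterly dictionary refresh proceed without any change to the knowledge base.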